Prediction and Validation of Gene-Disease Associations Using Methods Inspired by Social Network Analyses

Authors

  • U. Martin Singh-Blom
  • Nagarajan Natarajan
  • Ambuj Tewari
  • John O. Woods
  • Inderjit S. Dhillon
  • Edward M. Marcotte
Abstract

Correctly identifying associations of genes with diseases has long been a goal in biology. With the emergence of large-scale gene-phenotype association datasets, we can leverage statistical and machine learning methods to help achieve this goal. In this paper, we present two methods for predicting gene-disease associations based on functional gene associations and gene-phenotype associations in model organisms. The first method, the Katz measure, is motivated by its success in social network link prediction and is closely related to some recently proposed methods for gene-disease association inference. The second method, Catapult (Combining dATa Across species using Positive-Unlabeled Learning Techniques), is a supervised machine learning method that uses a biased support vector machine whose features are derived from walks in a heterogeneous gene-trait network. We study the performance of the proposed methods and related state-of-the-art methods using two different evaluation strategies on two distinct data sets, namely OMIM phenotypes and drug-target interactions. Comparing results across the two evaluation strategies, we show that although both methods perform very well, the Katz measure is better at identifying associations between traits and poorly studied genes, whereas Catapult is better suited to correctly identifying gene-trait associations overall.
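As a rough illustration of the Katz measure mentioned in the abstract, the sketch below scores node pairs by summing walks of increasing length, each damped by a factor beta per step. It uses the textbook truncated form, S = sum over l = 1..k of beta^l * A^l, which may differ from the exact variant used in the paper. The toy network, the value of beta, the walk-length cutoff, and the helper names `mat_mul` and `truncated_katz` are all assumptions made for this example.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def truncated_katz(adj, beta, max_len):
    """Return S = sum_{l=1..max_len} beta^l * A^l for adjacency matrix A."""
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # holds A^l, starting at l = 1
    weight = beta                    # holds beta^l, starting at l = 1
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        power = mat_mul(power, adj)
        weight *= beta
    return scores


# Hypothetical toy network: nodes 0-2 are genes, node 3 is a disease.
# Edges: gene0-gene1, gene1-gene2, gene2-disease3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = truncated_katz(adj, beta=0.1, max_len=3)

# Rank the genes by their Katz score against the disease node.
ranking = sorted(range(3), key=lambda g: scores[g][3], reverse=True)
print(ranking)
```

In this toy network, genes that are fewer steps away from the disease node receive higher scores, because short walks are damped less than long ones.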

Similar resources

Prediction of MicroRNA-Disease Associations Based on Social Network Analysis Methods

MicroRNAs constitute an important class of noncoding, single-stranded, ~22 nucleotide long RNA molecules encoded by endogenous genes. They play an important role in regulating gene transcription and the regulation of normal development. MicroRNAs can be associated with disease; however, only a few microRNA-disease associations have been confirmed by traditional experimental approaches. We intro...

PREDICTION OF LOAD DEFLECTION BEHAVIOUR OF TWO WAY RC SLAB USING NEURAL NETWORK APPROACH

Reinforced concrete (RC) slabs exhibit complexities in their structural behavior under load due to the composite nature of the material and the multitude and variety of factors that affect such behavior. Current methods for determining the load-deflection behavior of reinforced concrete slabs are limited in scope and are mostly dependent on the results of experimental tests. In this study, an ...

Comparison of Different 2D and 3D-QSAR Methods on Activity Prediction of Histamine H3 Receptor Antagonists

The histamine H3 receptor subtype has been the target of several recent drug development programs. Quantitative structure-activity relationship (QSAR) methods are used to predict the pharmaceutically relevant properties of drug candidates wherever applicable. The aim of this study was to compare the predictive powers of three different QSAR techniques, namely, multiple linear regression ...

Intelligent prediction of heating value of coal

The gross calorific value (GCV), or heating value, of a fuel sample is one of the important properties that define the energy of the fuel. Many researchers have proposed empirical formulas for estimating the GCV of coal. There are known laboratory methods, such as the bomb calorimeter, for determining the GCV, but these methods are cumbersome, costly and time-consuming. In this paper, m...

Journal title:

Volume 8, Issue

Pages -

Publication date: 2013